A New Approach for Gene Annotation Using Unambiguous Sequence Joining

Authors

  • Alexandre Tchourbanov
  • Daniel Quest
  • Hesham H. Ali
  • Mark A. Pauley
  • Robert B. Norgren
Abstract

This paper addresses accurate, automatic gene annotation following precise identification and annotation of exon and intron boundaries in biologically verified nucleotide sequences, using alignments of human genomic DNA to curated mRNA transcripts. We provide a detailed description of a new cDNA/DNA homology gene annotation algorithm that combines the results of BLASTN searches and spliced alignments. Compared to other programs currently in use, annotation quality is significantly increased through the unambiguous joining of genomic DNA sequences. We also address gene annotation with both non-canonical splice sites and short exons. The approach has been tested on the Genie learning subset as well as the full-scale human RefSeq, and has demonstrated performance as high as 97%.
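
To make the reference to non-canonical splice sites concrete, here is a minimal Python sketch that inspects the intron boundaries implied by a set of aligned exon coordinates. It is an illustration only and not the algorithm described in the paper: the function name `intron_splice_sites`, the 0-based half-open coordinate convention, and the toy sequence are all assumptions made for this example. It labels an intron as canonical when it starts with GT and ends with AG, the most common donor/acceptor pair.

```python
from typing import List, Tuple

# Donor and acceptor dinucleotides of the common GT-AG intron class.
CANONICAL = ("GT", "AG")


def intron_splice_sites(
    genomic: str, exons: List[Tuple[int, int]]
) -> List[Tuple[str, str, bool]]:
    """Return (donor, acceptor, is_canonical) for each intron.

    `exons` holds 0-based, half-open [start, end) intervals on `genomic`,
    sorted by start (a hypothetical convention chosen for this sketch).
    """
    genomic = genomic.upper()
    sites = []
    # Each pair of consecutive exons delimits one intron.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        donor = genomic[left_end:left_end + 2]           # first 2 bases of the intron
        acceptor = genomic[right_start - 2:right_start]  # last 2 bases of the intron
        sites.append((donor, acceptor, (donor, acceptor) == CANONICAL))
    return sites


# Toy sequence: exon, GT...AG intron, exon, AT...AC intron, exon.
example_genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA" + "ATATCCTTTTAC" + "TGA"
example_exons = [(0, 6), (18, 24), (36, 39)]
example_sites = intron_splice_sites(example_genomic, example_exons)
```

In this toy input the first intron is reported as canonical (GT-AG) and the second as non-canonical (AT-AC). A real annotation pipeline would take its exon coordinates from the alignment step rather than from hand-written intervals.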

Similar resources

A New Approach for Gene Annotation Using Unambiguous Sequence Joint

The problem addressed by this paper is accurate and automatic gene finding following precise identification/annotation of exon and intron boundaries for biologically verified nucleotide sequences, using the alignment of human genomic DNA to the curated mRNA transcript. We present a detailed description of a new cDNA/DNA homology gene annotation algorithm combining the results of BLASTN search w...

Complete Genomic Sequence of a Strain of Tomato Yellow Leaf Curl Virus from Iran

Background and Aims: Tomato yellow leaf curl virus (TYLCV) is one of the most destructive viruses of tomato and can reduce tomato yield by up to 100% in tropical and subtropical regions. In this study, the complete sequence of a TYLCV isolate from Hormozgan province, Iran, and its recombination event were determined. Methods: TYLCV infected tomato was collected from Hormozgan province. Total D...

Gene assembling: a new approach in molecular diagnosis of hereditary breast cancer

Abstract Background: Many disease susceptibility genes are large and consist of many exons in which point mutations are scattered throughout. Scanning each exon individually represents a tedious task which can be time consuming and expensive. There has been increasing demand for rapid and accurate methods for full scanning of unknown point mutations in large multi-exon genes. Gene Assembling i...

Annotation in Architecture: A Systematic Approach toward Mobilization and Development of Theoretical, Research, and Critical Basis in Architecture

Annotations usually refer to marginal notes that explain a difficult or ambiguous subject, or provide a general definition or a critical remark for a particular part of a text. Historically, annotating was a well-known tradition in Islamic sciences and was used especially in times when there was less potential for generating new knowledge. The main question of this research is, can the tradi...

Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species

Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to the D genome, which is an important wild bio-source for cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information is a versatile tool for species identification and phylogenetic implications in plants. Different ch...

Journal title:
  • Proceedings. IEEE Computer Society Bioinformatics Conference

Volume 2

Publication date: 2003